ABSTRACT

In this work, we study a basic and practically important strategy to
help prevent and/or delay an outbreak in a network: limiting the
contact between individuals. We introduce the average neighborhood
size as a new measure of the degree of being small-world and use it to
formally define the de-small-world network problem. We also prove the
NP-hardness of the general reachable pair cut problem and propose a
greedy approach based on edge betweenness as the benchmark for
selecting candidate edges. Furthermore, we transform the
de-small-world network problem into an OR-AND Boolean function
maximization problem, which is also NP-hard. In addition, we develop a
numerical relaxation approach to solve the Boolean function
maximization and the de-small-world problem. We also introduce the
short-betweenness, which measures edge importance in terms of all
short paths with distance no greater than a certain threshold, and use
it to speed up our numerical relaxation approach. The experimental
evaluation demonstrates the effectiveness and efficiency of our
approaches.

1. INTRODUCTION 

The interconnected network structure has been recognized to play a
pivotal role in many complex systems, ranging from natural (cellular
systems), to man-made (the Internet), to social and economic systems.
Many of these networks exhibit the "small-world" phenomenon, i.e., any
two vertices in the network are often connected by a small number of
intermediate vertices (the shortest-path distance is small). The
small-world phenomenon in real populations was first discovered by
Milgram [13]. In his study, the average distance between two Americans
is around 6. Several recent studies [11, 14, 8] offer significant
evidence to support similar observations in online social networks and
the Internet itself. In addition, the power-law degree distribution (or
scale-free property) that many of these networks exhibit also directly
leads to a small average distance [1]. Clearly, the small-world
property can help facilitate communication and speed up the diffusion
process and information spreading in a large network.

However, the small-world effect can be a dangerous double-edged
sword. While a system benefits from efficient communication and fast
information diffusion, it also becomes more vulnerable to various
attacks: diseases, (computer) viruses, spam, misinformation, etc. For
instance, it has been shown that a small-world graph can have much
faster disease propagation than a regular lattice or a random graph
[15]. Indeed, the six degrees of separation may suggest that a highly
infectious disease could spread to all six billion people living on
earth in about six incubation periods of the disease [15]. The
small-world property of the Internet and WWW not only makes it much
easier for computer viruses and spam to spread, but also makes them
hard to stop. More recently, the misinformation problem in social
networks has caused several public outcries [3]. These small-world
online social networks potentially facilitate the spread of
misinformation to a large audience in a short time, which may cause
public panic and have other disruptive effects.

To prevent an outbreak, the most basic strategy is to remove the
affected individuals (or computers) from the network system, as in a
quarantine. However, in many situations, an explicit quarantine may be
hard to achieve: the contagious individuals are either unknown or hard
to detect; or it is impossible to detect and remove every infected
individual; or so many are already affected that it becomes too costly
to remove all of them in a timely fashion. Thus, it is important to
consider alternative strategies to help prevent or even delay the
spreading, where the delay can be essential for discovering and/or
deploying new methods for dealing with the outbreak.

Recently, there has been a lot of interest in understanding the
network factors (such as the small-world and scale-free properties) in
epidemics and information diffusion processes, and in utilizing
network structures to detect and prevent outbreaks. Several studies
have focused on modeling disease epidemics on small-world and/or
scale-free networks [17], [15], [16]; in [12], Leskovec et al. study
how to deploy sensors cost-effectively in a network system (sensors
are assigned to vertices) to detect an outbreak; in [3], Budak et al.
consider how to limit misinformation by identifying a set of
individuals who need to adopt the "good" information (being immune in
epidemics) in order to minimize the number of those affected by the
"bad" information (being infected in epidemics). In addition, we note
that from a different angle (viral marketing), there has been a series
of studies on the influence maximization problem [18, 9], which aims
to discover a set of most influential seeds to maximize information
spreading in the network. From the disease epidemics perspective,
those seeds (assuming they are selected using a contagion model) may
need particular protection to prevent an outbreak.

In this work, we study another basic and practically important
strategy to help prevent and/or delay an outbreak in a network:
limiting the contact between individuals. Different from the pure
quarantine approach, here individuals can still act in the network
system, though some contact relationships are forbidden. In other
words, instead of removing vertices (individuals) from a network as in
the quarantine approach, this strategy focuses on removing edges so
that (potential) outbreaks can be slowed down. Intuitively, if an
individual contacts fewer other individuals, the chance for him or her
to spread or be infected by the disease (misinformation) becomes
smaller. From the network viewpoint, the edge-removal strategy
essentially makes the underlying (social) network less small-world, or
simply "de-small-world", i.e., the distances between individuals
increase, which delays the spreading process. In many situations, such
a strategy is easily and even voluntarily adopted. For instance,
during the SARS epidemic in Beijing, 2004, far fewer people appeared
in public places. In addition, this approach can also be deployed as a
complement to the quarantine approach.

1.1 Our Contribution 

Even though the edge-removal or de-small-world approach seems
conceptually easy to understand, its mathematical foundation still
lacks study. Clearly, different edges (interactions) in the network
are not equivalent in terms of slowing down a potential outbreak: for
a given individual, a link to a high-degree individual can be more
dangerous than a link to a low-degree one. The edge importance (in
terms of distance) coincides in particular with Kleinberg's
theoretical model [10], which adds long-range edges on top of an
underlying grid to explain the small-world phenomenon. In this model,
the long-range edges are the main factor that connects otherwise
long-distance pairs through a smaller number of edges. However, there
are no direct studies that fit such a model to real-world graphs to
discover those long-range edges. Meanwhile, additional constraints,
such as the number of edges that can be removed from the network, may
exist because removing an edge can carry a certain cost. These factors
and requirements give rise to the following fundamental research
problem: how can we maximally de-small-world a graph (make it less
small-world) by removing a fixed number of edges?

To tackle the de-small-world network problem, we make the fol- 
lowing contributions in this work: 

1. We introduce the average neighborhood size as a new measure of
the degree of being small-world and utilize it to formally define
the de-small-world network problem. Note that the typical average
distance used for measuring small-world effects cannot treat
connected and disconnected networks uniformly; neither does it fit
well with the spreading process. We also reformulate the
de-small-world problem as the local-reachable pair cut problem.

2. We prove the NP-hardness of the general reachable pair cut
problem and propose a greedy approach based on edge betweenness as
the benchmark for selecting candidate edges for the de-small-world
network problem. We transform the de-small-world network problem
and express it as an OR-AND Boolean function maximization problem,
which is also NP-hard.

3. We develop a numerical relaxation approach to solve the
de-small-world problem using its OR-AND Boolean format. Our
approach can find a local optimum based on an iterative gradient
optimization procedure. In addition, we further generalize the
betweenness measure and introduce the short betweenness, which
measures edge importance in terms of all paths with distance no
greater than a certain threshold. Using this measure, we can speed
up the numerical relaxation approach by selecting a small set of
candidate edges for removal.

4. We perform a detailed experimental evaluation, which demonstrates
the effectiveness and efficiency of the proposed approaches.

2. PROBLEM DEFINITION AND PRELIMINARY



In this section, we first formally define the de-small-world network
problem and prove its NP-hardness (Subsection 2.1); then we introduce
the basic greedy approaches based on edge betweenness, which will
serve as the basic benchmark (Subsection 2.2); and finally we show
that the de-small-world network problem can be transformed and
expressed as an OR-AND Boolean function maximization problem
(Subsection 2.3).

2.1 Problem Formulation 

In order to model the edge-removal process and formally define the
de-small-world network problem, a criterion is needed to precisely
capture the degree of being small-world. Note that here the goal is to
help prevent and/or delay a potential outbreak and epidemic process.
The typical measure of a small-world network is based on the average
distance (the average length of the shortest path between any pair of
vertices in the entire network). However, this measure cannot provide
a unified treatment of connected and disconnected networks.
Specifically, once a connected network is broken into several
disconnected parts, the average distance is not easy to express,
because pairs in different parts have no finite distance. On the other
hand, we note that the de-small-world network problem is different
from the network decomposition (clustering) problem, which tries to
break the entire network into several components (connected
subgraphs). From the outbreak prevention and delaying perspective, the
cost of network decomposition is not only too high, but decomposition
may also not be effective. This is because each individual component
may itself still be small-world, and completely separating the
contagious/infected group from the rest of the population (the other
components) is often impossible.

Given this, we introduce the average neighborhood size as a new
measure of the degree of being small-world and utilize it to formally
define the de-small-world network problem. In particular, the new
measure not only treats connected and disconnected networks uniformly
but also directly helps model the spreading/diffusion process. Simply
speaking, for each vertex $v$ in a network $G = (V, E)$, where $V$ is
the vertex set and $E$ is the edge set, we define the neighborhood
size of $v$ as the number of vertices with distance no greater than
$k$ to $v$, denoted as $N_k(v)$. Here $k$ is the user-specified
spreading (or delaying) parameter, which aims to measure the outbreak
speed, i.e., the maximum distance, within a specified time unit, from
an individual $u$ (source) to another individual $v$ (destination) who
can be infected if $u$ is infected. Thus, the average neighborhood
size of $G$, $\frac{1}{|V|} \sum_{v \in V} N_k(v)$, can be used to
measure the robustness of the network with respect to a potential
outbreak in a certain time frame. Clearly, a potential problem of a
small-world network is that even for a small $k$, the average
neighborhood size can still be rather large, indicating that a large
(expected) number of individuals can be quickly affected (within time
frame $k$) during an outbreak.

Formally, the de-small-world network problem is defined as fol- 
lows: 

Definition 1. (De-Small-World Network Problem) Given the
edge-removal budget $L > 0$ and the spreading parameter $k > 1$, we
seek a subset of edges $E_r \subseteq E$, where $|E_r| = L$, such that
the average neighborhood size is minimized:

$$\min_{|E_r| = L} \frac{1}{|V|} \sum_{v \in V} N_k(v \mid G \setminus E_r) \qquad (1)$$

where $N_k(v \mid G \setminus E_r)$ is the neighborhood size of $v$ in
the graph $G$ after removing all edges in $E_r$ from the edge set $E$.

Note that in the above definition, we assume each vertex has equal
probability of being the source of infection. In the general setting,
we may associate each vertex $v$ with a probability that indicates its
likelihood of being (initially) infected. Furthermore, we may assign
each edge a weight that indicates the cost of removing it. For
simplicity, we do not study those extensions in this work, though our
approaches can in general be extended to handle those additional
parameters. In addition, we note that in our problem, we require the
spreading parameter $k > 1$. This is because for $k = 1$ the problem
is trivial: the average neighborhood size is equivalent to the average
vertex degree, and removing any edge has the same effect. In other
words, when $k = 1$, the neighborhood criterion does not capture the
spreading or cascading effects of the small-world graph. Therefore, we
focus on $k > 1$, though in general $k$ is still relatively small (for
instance, no higher than 3 or 4).

Reachable Pair Cut Formulation: We note that the de-small-world
network problem can be defined in terms of the reachable pair cut
formulation. A pair of vertices whose distance is no greater than $k$
is referred to as a local-reachable pair, or simply a reachable pair.
Let $\mathcal{R}_G$ record the set of all local-reachable pairs in $G$.

Definition 2. (Reachable Pair Cut Problem) For a given
local-reachable pair $(u, v)$, if $d(u, v \mid G) \leq k$ in $G$ but
$d(u, v \mid G \setminus E_s) > k$, where $E_s$ is an edge set in $G$,
then we say $(u, v)$ is locally cut (or simply cut) by $E_s$. Given
the edge-removal budget $L > 0$ and the spreading parameter $k > 1$,
the reachable pair cut problem aims to find the edge set
$E_r \subseteq E$ with $|E_r| = L$ such that the maximum number of
pairs in $\mathcal{R}_G$ is cut by $E_r$.

Note that here the (local) cut of a pair of vertices simply refers to
increasing their distance, not necessarily to completely disconnecting
them in the graph $G \setminus E_s$. Also, since
$\mathcal{R}_{G \setminus E_r} \subseteq \mathcal{R}_G$, i.e., every
local-reachable pair in the remaining network $G \setminus E_r$ is
also local-reachable in the original graph $G$, maximizing
$|\mathcal{R}_G| - |\mathcal{R}_{G \setminus E_r}|$ is equivalent to
minimizing the number of local-reachable pairs
$|\mathcal{R}_{G \setminus E_r}|$. Finally, the correctness of this
reformulation (de-small-world problem = reachable pair cut problem)
follows from a simple observation:
$\sum_{v \in V} N_k(v \mid G) = 2|\mathcal{R}_G|$ (and
$\sum_{v \in V} N_k(v \mid G \setminus E_r) = 2|\mathcal{R}_{G \setminus E_r}|$).
Basically, every reachable pair is counted twice in the neighborhood
size criterion.
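The identity between neighborhood sizes and reachable pairs can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes an unweighted, undirected graph stored as an adjacency dictionary, excludes $v$ itself from $N_k(v)$ (which is what the factor 2 in the identity implies), and uses a small hypothetical toy graph.

```python
from collections import deque


def bfs_within(adj, source, k):
    """Return {vertex: distance} for all vertices within distance k of source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == k:          # do not expand beyond depth k
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def neighborhood_size(adj, v, k):
    """N_k(v): number of other vertices within distance k of v."""
    return len(bfs_within(adj, v, k)) - 1


def reachable_pairs(adj, k):
    """R_G: set of unordered pairs whose distance is at most k."""
    pairs = set()
    for u in adj:
        for v in bfs_within(adj, u, k):
            if u != v:
                pairs.add(frozenset((u, v)))
    return pairs


def build_adj(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


# Hypothetical toy graph: path a-b-c-d plus the chord b-d.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("b", "d")]
k = 2
adj = build_adj(edges)
total = sum(neighborhood_size(adj, v, k) for v in adj)
pairs = reachable_pairs(adj, k)

# Removing the chord b-d pushes the pair (a, d) to distance 3 > k.
adj_cut = build_adj([e for e in edges if e != ("b", "d")])
total_cut = sum(neighborhood_size(adj_cut, v, k) for v in adj_cut)
pairs_cut = reachable_pairs(adj_cut, k)
```

In this toy instance, removing a single edge cuts exactly one reachable pair and lowers the sum of neighborhood sizes by two, which is the relationship the reformulation relies on.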

In the following, we study the hardness of the general reachable 
pair cut problem. 

THEOREM 1. Given a set $RS$ of local-reachable pairs in $G = (V, E)$
with respect to $k$, the problem of finding $L$ edges $E_r \subseteq E$
($|E_r| = L$) in $G$ such that the maximal number of pairs in $RS$ is
cut by $E_r$ is NP-hard.

Note that in the general problem, $RS$ can be any subset of
$\mathcal{R}_G$. The NP-hardness of the general reachable pair cut
problem is a strong indicator that the de-small-world network problem
is also hard. The proof of Theorem 1 is in the Appendix. In addition,
we note that the submodularity property plays an important role in
solving the vertex-centered maximal influence [5], outbreak detection
[12], and limiting misinformation spreading [3] problems. However,
this property does not hold for the edge-centered de-small-world
problem.

LEMMA 1. Let the set function $f : 2^E \to \mathbb{Z}^+$ record the
number of local-reachable pairs in $\mathcal{R}_G$ that are cut by an
edge set $E_s$ in graph $G$. The function $f$ is neither submodular
(diminishing return) nor supermodular.

Proof Sketch: A supermodular function satisfies
$f(A \cup B) + f(A \cap B) \geq f(A) + f(B)$; a submodular function
satisfies $f(A) + f(B) \geq f(A \cup B) + f(A \cap B)$. Here we use
two counterexamples (suppose $k = 2$). For $G_1$ in Figure 1, the edge
sets $\{e_1\}$ and $\{e_2\}$ each cut two reachable pairs,
$\{ab, ac\}$ and $\{ac, bc\}$ respectively. Then we have
$f(\{e_1\}) + f(\{e_2\}) = 4$; however,
$f(\{e_1, e_2\}) + f(\emptyset) = 3$. Therefore, supermodularity does
not hold. For $G_2$ in Figure 1, the edge sets $\{e_1\}$ and $\{e_3\}$
can each cut only one reachable pair. However, $\{e_1, e_3\}$ cuts
four pairs. That means
$f(\{e_1, e_3\}) + f(\emptyset) > f(\{e_1\}) + f(\{e_3\})$. Therefore,
submodularity does not hold. □
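The counts in the proof sketch can be reproduced with a few lines of code. This is only an illustrative sketch: the two graphs below are assumed reconstructions chosen to match the numbers in the text (a three-vertex path for $G_1$ and a four-vertex cycle for $G_2$), since the figure itself is not reproduced here, and `cut_count` is a hypothetical helper that evaluates $f$ by brute force.

```python
from collections import deque


def pairs_within(edges, vertices, k):
    """Unordered vertex pairs at distance <= k in the given edge set."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    pairs = set()
    for s in vertices:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            if dist[u] == k:
                continue
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        pairs |= {frozenset((s, t)) for t in dist if t != s}
    return pairs


def cut_count(edges, vertices, removed, k):
    """f(removed): number of reachable pairs cut by removing the given edges."""
    before = pairs_within(edges, vertices, k)
    after = pairs_within([e for e in edges if e not in removed], vertices, k)
    return len(before - after)


# Assumed G1: path a - b - c with e1 = (a, b) and e2 = (b, c).
g1_v = ["a", "b", "c"]
g1_e1, g1_e2 = ("a", "b"), ("b", "c")
g1 = [g1_e1, g1_e2]
f_e1 = cut_count(g1, g1_v, {g1_e1}, 2)
f_e2 = cut_count(g1, g1_v, {g1_e2}, 2)
f_e1e2 = cut_count(g1, g1_v, {g1_e1, g1_e2}, 2)

# Assumed G2: cycle a - b - c - d - a with e1 = (a, b) and e3 = (c, d).
g2_v = ["a", "b", "c", "d"]
g2_e1, g2_e2, g2_e3, g2_e4 = ("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")
g2 = [g2_e1, g2_e2, g2_e3, g2_e4]
h_e1 = cut_count(g2, g2_v, {g2_e1}, 2)
h_e3 = cut_count(g2, g2_v, {g2_e3}, 2)
h_e1e3 = cut_count(g2, g2_v, {g2_e1, g2_e3}, 2)
```

With $f(\emptyset) = 0$, the first graph violates the supermodular inequality (3 is not at least 4) and the second violates the submodular one (2 is not at least 4).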




Figure 1: Counterexamples

2.2 Greedy Betweenness-Based Approach 

Finding the optimal solution for the de-small-world problem is
likely to be NP-hard. Clearly, it is computationally prohibitive to
enumerate all possible removal edge sets $E_r$ and to measure how
many reachable pairs could be cut or how much the average
neighborhood size is reduced. In the following, we describe a greedy
approach to heuristically discover a solution edge set. This approach
also serves as the benchmark for the de-small-world problem.

The basic approach is based on the edge-betweenness, which is a
useful criterion for measuring edge importance in a network.
Intuitively, the edge-betweenness measures the importance of an edge
with respect to the shortest paths in the network. A high betweenness
suggests that the edge is involved in many shortest paths, and thus
removing it will likely increase the distance of the pairs linked by
these shortest paths. Here, we consider two variants of
edge-betweenness: the (global) edge-betweenness [5] and the local
edge-betweenness [6]. The global edge-betweenness is the original one
[5] and is defined as follows:

$$B(e) = \sum_{s \neq t \in V} \frac{\delta_{st}(e)}{\delta_{st}}$$

where $\delta_{st}$ is the total number of shortest paths between
vertices $s$ and $t$, and $\delta_{st}(e)$ is the total number of
shortest paths between $s$ and $t$ containing edge $e$.

The local edge-betweenness considers only those vertex pairs whose
shortest-path distance is no greater than $k$, and is defined as

$$LB(e) = \sum_{s \neq t \in V,\ d(s,t) \leq k} \frac{\delta_{st}(e)}{\delta_{st}}$$

The reason for using the local edge-betweenness measure is that in
the de-small-world (and reachable pair cut) problem, we focus on the
local-reachable pairs (distance no greater than $k$). Thus, the
contribution to the (global) betweenness from pairs with distance
greater than $k$ can be omitted. The exact edge-betweenness can be
computed in $O(nm)$ worst-case time [2], where $n = |V|$ (the number
of vertices) and $m = |E|$ (the number of edges) in a given graph,
though in practice the local one can be computed much faster.

Using the edge-betweenness measure, we may consider the following
generic procedure to select the $L$ edges for $E_r$:

1) Select the top $r \leq L$ edges into $E_r$, and remove those edges
from the input graph $G$;

2) Recompute the betweenness for all remaining edges in the updated
graph $G$;

3) Repeat the above procedure $\lceil L/r \rceil$ times until all $L$
edges are selected.



Note that in the special case $r = 1$, where we select one edge in
each iteration, the procedure is very similar to the Girvan-Newman
algorithm [5], which utilizes the edge-betweenness for community
discovery. Gregory [6] generalizes it to use the local
edge-betweenness. Here, we only pick $L$ edges and allow users to
choose how frequently the edge-betweenness is recomputed (mainly for
efficiency). The overall time complexity of the betweenness-based
approach is $O(\lceil L/r \rceil nm)$ (assuming the exact betweenness
computation is adopted).
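The greedy procedure with the local edge-betweenness can be sketched as follows. This is a minimal illustration under stated assumptions rather than the benchmark implementation: the graph is unweighted and undirected, the local betweenness is computed by a depth-limited variant of Brandes-style accumulation (each unordered pair is visited from both endpoints, which doubles every score but leaves the ranking unchanged), and ties are broken by vertex name for determinism. The function names and the toy graph are hypothetical.

```python
from collections import deque


def local_edge_betweenness(adj, k):
    """LB(e) summed over ordered pairs (s, t) with d(s, t) <= k."""
    score = {}
    for s in adj:
        dist, sigma, preds = {s: 0}, {s: 1}, {s: []}
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            if dist[u] == k:                  # depth-limited BFS
                continue
            for v in adj[u]:
                if v not in dist:
                    dist[v], sigma[v], preds[v] = dist[u] + 1, 0, []
                    queue.append(v)
                if dist[v] == dist[u] + 1:    # u precedes v on a shortest path
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {v: 0.0 for v in order}
        for v in reversed(order):             # accumulate pair dependencies
            for u in preds[v]:
                c = sigma[u] / sigma[v] * (1.0 + delta[v])
                e = frozenset((u, v))
                score[e] = score.get(e, 0.0) + c
                delta[u] += c
    return score


def greedy_remove(edges, k, L, r=1):
    """Pick L edges, recomputing local betweenness after every r removals."""
    remaining = {frozenset(e) for e in edges}
    removed = []
    while len(removed) < L and remaining:
        adj = {}
        for e in remaining:
            u, v = tuple(e)
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
        score = local_edge_betweenness(adj, k)
        ranked = sorted(remaining, key=lambda e: (-score.get(e, 0.0), sorted(e)))
        batch = ranked[:min(r, L - len(removed))]
        removed.extend(batch)
        remaining.difference_update(batch)
    return removed


# Hypothetical toy graph: two triangles joined by the bridge c-d.
barbell = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
           ("d", "e"), ("e", "f"), ("d", "f")]
first = greedy_remove(barbell, k=2, L=1)
```

On this toy graph the bridge carries every short path between the two triangles, so it is selected first; with `r > 1` the scores are recomputed only once per batch, trading accuracy for speed as described above.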

2.3 OR-AND Boolean Function and its Maximization Problem

In the following, we transform the de-small-world network problem
and express it as an OR-AND Boolean function maximization problem,
which forms the basis for our optimization approach in the next
section. First, we utilize the OR-AND graph to help represent the
de-small-world (reachable pair cut) problem. Let $P$ denote the set of
all short paths in $G$ that have length at most $k$.




Figure 2: OR-AND graph. (a) Example Graph; (b) OR-AND Graph

OR-AND Graph: Given a graph $G = (V, E)$, the vertex set of an OR-AND
graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ is comprised of three
kinds of nodes, $\mathcal{V}_E$, $\mathcal{V}_P$, and
$\mathcal{V}_{\mathcal{R}_G}$, where each node in $\mathcal{V}_E$
corresponds to a unique edge in $E$, each node in $\mathcal{V}_P$
corresponds to a short path in $P$, and each node in
$\mathcal{V}_{\mathcal{R}_G}$ corresponds to a unique reachable pair
in $G$ (with respect to $k$). Figure 2(b) shows those nodes for the
graph $G$ in Figure 2(a). The edge set consists of two types of edges:
1) Each short-path node in $\mathcal{V}_P$ is linked with the nodes in
$\mathcal{V}_E$ that correspond to the edges in the path. For
instance, path node $p_1$ in $\mathcal{V}_P$ links to edge nodes $e_1$
and $e_2$ in $\mathcal{V}_E$ in Figure 2(b). 2) Each reachable-pair
node in $\mathcal{V}_{\mathcal{R}_G}$ links to the path nodes that
connect the reachable pair. For instance, the reachable pair $bd$ is
connected with path nodes $p_1$ and $p_2$ in Figure 2(b).

Intuitively, in the OR-AND graph, we can see that in order to cut a
reachable pair, we have to cut all the short paths between its two
vertices (AND). To cut one short path, we need to remove only one edge
in that path (OR). Let $P(u, v)$ consist of all the (simple) short
paths between $u$ and $v$ whose length is no more than $k$. For each
short path $p$ in $P(u, v)$, let $e$ also denote a Boolean variable
for edge $e \in p$: if $e = T$, then the edge is not cut; if $e = F$,
then the edge is cut ($e \in E_r$). Thus, for each reachable pair
$(u, v) \in \mathcal{R}_G$, we can use a Boolean OR-AND expression to
describe it:

$$I(u, v) = \bigvee_{p \in P(u,v)} \bigwedge_{e \in p} e \qquad (2)$$

For instance, in the graph $G$ (Figure 2(b)),

$$I(b, d) = (e_1 \wedge e_2) \vee (e_3 \wedge e_4)$$

Here, $I(b, d) = F$, indicating that the pair is cut, only if both
$p_1$ and $p_2$ are cut. For instance, if $e_1 = F$ and $e_3 = F$,
then $I(b, d) = F$; and if $e_1 = F$ but $e_3 = T$ and $e_4 = T$, then
$I(b, d) = T$. Given this, the de-small-world problem (and the
reachable pair cut problem) can be expressed as the following Boolean
function maximization problem.

Definition 3. (Boolean Function Maximization Problem)

Given a list of Boolean functions (such as $I(u, v)$, where
$(u, v) \in \mathcal{R}_G$), we seek a Boolean variable assignment in
which exactly $L$ variables are assigned false ($e = F$ iff
$e \in E_r$, and $|E_r| = L$), such that the maximal number of Boolean
functions are false ($I(u, v) = F$ corresponds to $(u, v)$ being cut
by $E_r$).

Unfortunately, the Boolean function maximization problem is also
NP-hard since it can directly express the general reachable pair cut
problem. In the next section, we will introduce a numerical relaxation
approach to solve this problem.
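To make the OR-AND formulation concrete, the sketch below evaluates $I(u, v)$ for a given removal set and solves the maximization by exhaustive search. It is only an illustration: the edge labels `e1` to `e4` and the single pair $(b, d)$ follow the running example, while `brute_force` is a hypothetical helper that is exponential in the number of edges and therefore usable only on tiny inputs.

```python
from itertools import combinations


def pair_connected(paths, removed):
    """Evaluate I(u, v): OR over short paths of the AND of 'edge not removed'."""
    return any(all(e not in removed for e in path) for path in paths)


def num_cut(short_paths, removed):
    """Number of reachable pairs whose Boolean expression evaluates to False."""
    return sum(1 for paths in short_paths.values()
               if not pair_connected(paths, removed))


def brute_force(short_paths, edges, L):
    """Try every removal set of size L and return the first best one."""
    return max(combinations(edges, L),
               key=lambda rem: num_cut(short_paths, set(rem)))


# Running example: I(b, d) = (e1 AND e2) OR (e3 AND e4).
short_paths = {("b", "d"): [("e1", "e2"), ("e3", "e4")]}
edges = ["e1", "e2", "e3", "e4"]

still_connected = pair_connected(short_paths[("b", "d")], {"e1"})
now_cut = not pair_connected(short_paths[("b", "d")], {"e1", "e3"})
best = brute_force(short_paths, edges, L=2)
```

Removing one edge from only one of the two paths leaves the pair reachable, whereas removing one edge from each path cuts it, which is exactly the AND-of-ORs structure described above.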

3. PATH ALGEBRA AND OPTIMIZATION 
ALGORITHM 

In this section, we introduce a numerical relaxation approach to
solve the Boolean function maximization problem (and thus the
de-small-world problem). The basic idea is that since a direct
solution of the Boolean function maximization problem is hard, instead
of working on the Boolean (binary) edge variables, we relax each of
them to a numerical value. The challenge is that we need to define the
numerical optimization problem such that it meets the following two
criteria: 1) it matches the Boolean function maximization rather
accurately; and 2) it enables numerical solvers to be applied to
optimize the numerical function. In Subsection 3.1, we introduce the
numerical optimization problem based on the path algebra. In
Subsection 3.2, we discuss the optimization approach for solving this
problem.

3.1 Path-Algebra and Numerical Optimization Problem

To construct a numerical optimization problem for the Boolean
function maximization format of the de-small-world problem, we
introduce the following path-algebra to describe all the short paths
between any reachable pair in $\mathcal{R}_G$. For each edge $e$ in
the graph $G = (V, E)$, we associate it with a variable $x_e$. Then,
for any reachable pair $(u, v) \in \mathcal{R}_G$, we define its
corresponding path-algebra expression $\mathcal{P}(u, v)$ as follows:

$$\mathcal{P}(u, v) = \sum_{p \in P(u,v)} \prod_{e \in p} x_e \qquad (3)$$

Taking the path-algebra for $(b, d)$ in Figure 2 and Figure 3 as an
example, we have

$$\mathcal{P}(b, d) = x_1 x_2 + x_3 x_4$$

Figure 3: Algebra Variable



Basically, the path-algebra expression $\mathcal{P}(u, v)$ directly
corresponds to the Boolean expression $I(u, v)$ by replacing AND
($\wedge$) with product ($\times$), OR ($\vee$) with sum ($+$), and
the Boolean variable $e$ with the algebraic variable $x_e$.
Intuitively, $\mathcal{P}(u, v)$ records the weighted sum of the paths
in $P(u, v)$, where the weight of each path is the product of its edge
variables $x_e$. Note that when $x_e = 1$ for every edge $e$,
$\mathcal{P}(u, v)$ simply records the number of different short paths
(with length no more than $k$) between $u$ and $v$, i.e.,
$\mathcal{P}(u, v) = |P(u, v)|$. Furthermore, assuming $x_e \geq 0$,
$\mathcal{P}(u, v) = 0$ is equivalent to each path $p \in P(u, v)$
containing at least one edge variable equal to 0. In other words, if
$x_e = 0$ iff $e = F$, then $\mathcal{P}(u, v) = 0$ iff
$I(u, v) = F$, and $\mathcal{P}(u, v) > 0$ iff $I(u, v) = T$.

Given this, we may be tempted to optimize the following objective
function based on the path-algebra expression to represent the Boolean
function maximization problem:
$\min \sum_{(u,v) \in \mathcal{R}_G} \mathcal{P}(u, v)$. However, this
does not accurately reflect our goal, because minimizing
$\sum_{(u,v) \in \mathcal{R}_G} \mathcal{P}(u, v)$ may not require any
$\mathcal{P}(u, v) = 0$ (which shall be our main goal). This is
because $\mathcal{P}(u, v)$ corresponds to the weighted sum of path
products. Can we use the path-algebra to emphasize the importance of
$\mathcal{P}(u, v) = 0$ in the objective function?

We provide a positive answer to this question by utilizing an
exponential function transformation. Specifically, we introduce the
following numerical maximization problem based on the path expression:

$$\max \sum_{(u,v) \in \mathcal{R}_G} e^{-\lambda \mathcal{P}(u,v)}, \quad \text{where } 0 \leq x_e \leq 1,\ \sum_e x_e \geq X - L \qquad (4)$$

Note that $0 < e^{-\lambda \mathcal{P}(u,v)} \leq 1$ (each
$x_e \geq 0$), and only when $\mathcal{P}(u, v) = 0$ do we have
$e^{-\lambda \mathcal{P}(u,v)} = 1$ (the largest value for each term).
When $\mathcal{P}(u, v) \approx 1$, the term
$e^{-\lambda \mathcal{P}(u,v)}$ can already be rather small
(approaching 0) for a suitably large $\lambda$. The parameter
$\lambda$ is the adjusting parameter that helps control the
exponential curve and smooth the objective function. Furthermore, the
summation constraint ($\sum_e x_e \geq X - L$) expresses the budget
condition that there shall be $L$ variables with $x_e \approx 0$. Here
$X$ is the total number of variables in the objective function
($X = |E|$ if we consider every single edge variable $x_e$).
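A small numerical sketch may help to see how the exponential transformation rewards fully cut pairs. It evaluates the path-algebra expression and the relaxed objective for the running pair $(b, d)$; the function names and the chosen value of $\lambda$ are assumptions made for illustration only.

```python
import math


def path_algebra(paths, x):
    """P(u, v): sum over short paths of the product of their edge variables."""
    return sum(math.prod(x[e] for e in p) for p in paths)


def objective(short_paths, x, lam):
    """Relaxed objective: sum over reachable pairs of exp(-lam * P(u, v))."""
    return sum(math.exp(-lam * path_algebra(paths, x))
               for paths in short_paths.values())


# Running example: P(b, d) = x1 * x2 + x3 * x4.
short_paths = {("b", "d"): [("e1", "e2"), ("e3", "e4")]}

all_ones = {"e1": 1.0, "e2": 1.0, "e3": 1.0, "e4": 1.0}
one_path_cut = {"e1": 0.0, "e2": 1.0, "e3": 1.0, "e4": 1.0}
both_paths_cut = {"e1": 0.0, "e2": 1.0, "e3": 0.0, "e4": 1.0}

lam = 3.0  # illustrative value, not taken from the paper
values = [objective(short_paths, x, lam)
          for x in (all_ones, one_path_cut, both_paths_cut)]
```

With all variables equal to 1 the expression simply counts the two short paths; cutting one path raises the term only slightly, while cutting both paths drives it to its maximum of 1, so the objective concentrates its reward on pairs that are actually cut.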

3.2 Gradient Optimization 

Clearly, it is very hard to find the exact (or closed-form) solution
for maximizing the function in Equation 4 under these linear
constraints. In this section, we utilize the standard gradient
(ascent) approach together with the active set method [7] to discover
a local maximum. Gradient ascent iteratively takes steps proportional
to the positive of the gradient to approach a local maximum. The
active set approach is a standard approach in optimization that deals
with feasible regions (represented as constraints). Here we utilize it
to handle the constraints in Equation 4.
Gradient Computation: To perform gradient ascent optimization, we
need to compute the gradient $g(x_e)$ for each variable $x_e$.
Fortunately, we can derive a closed form of $g(x_e)$ for
$\sum_{(u,v) \in \mathcal{R}_G} e^{-\lambda \mathcal{P}(u,v)}$ as
follows:

$$g(x_e) = \frac{\partial \sum_{(u,v) \in \mathcal{R}_G} e^{-\lambda \mathcal{P}(u,v)}}{\partial x_e} = \sum_{(u,v) \in \mathcal{R}_G} -\lambda\, \mathcal{P}(u, v, e)\, e^{-\lambda \mathcal{P}(u,v)}$$

where $\mathcal{P}(u, v, e)$ is the sum of the path-products over all
the paths going through $e$, with $x_e$ treated as 1 in each
path-product. More precisely, let $P(u, v, e)$ be the set of all short
paths (with length no more than $k$) between $u$ and $v$ going through
edge $e$; then

$$\mathcal{P}(u, v, e) = \sum_{p \in P(u,v,e)} \prod_{e' \in p \setminus \{e\}} x_{e'} \qquad (5)$$

Using the example in Figure 2 and Figure 3, we have

$$\mathcal{P}(b, d, e_1) = x_2$$

Note that once we have the gradients for all edge variables $x_e$, we
update them accordingly,

$$x_e = x_e + \beta g(x_e),$$

where $\beta$ is the step size (a very small positive real value) that
controls the rate of convergence.
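The closed-form gradient can be sanity-checked against a finite-difference approximation. The sketch below does this for the running pair $(b, d)$; the variable values, $\lambda$, and the helper names are illustrative assumptions.

```python
import math


def path_algebra(paths, x):
    """P(u, v): sum over short paths of the product of their edge variables."""
    return sum(math.prod(x[f] for f in p) for p in paths)


def partial_path_algebra(paths, x, e):
    """P(u, v, e): paths through e, with x_e itself left out of each product."""
    return sum(math.prod(x[f] for f in p if f != e) for p in paths if e in p)


def objective(short_paths, x, lam):
    return sum(math.exp(-lam * path_algebra(paths, x))
               for paths in short_paths.values())


def gradient(short_paths, x, lam, e):
    """Closed-form derivative of the objective with respect to x_e."""
    return sum(-lam * partial_path_algebra(paths, x, e)
               * math.exp(-lam * path_algebra(paths, x))
               for paths in short_paths.values())


short_paths = {("b", "d"): [("e1", "e2"), ("e3", "e4")]}
x = {"e1": 0.9, "e2": 0.5, "e3": 0.7, "e4": 0.2}  # illustrative values
lam, h = 2.0, 1e-6

analytic = gradient(short_paths, x, lam, "e1")

# Central finite difference in the x_e1 direction.
x_plus = dict(x, e1=x["e1"] + h)
x_minus = dict(x, e1=x["e1"] - h)
numeric = (objective(short_paths, x_plus, lam)
           - objective(short_paths, x_minus, lam)) / (2 * h)
```

The gradient is non-positive whenever all variables are non-negative, so a gradient-ascent step never increases an edge variable on its own; the budget constraint is what prevents all variables from collapsing to 0.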

$\mathcal{P}(u, v)$ and $\mathcal{P}(u, v, e)$ Computation: To compute
the gradient, we need to compute all $\mathcal{P}(u, v)$ and
$\mathcal{P}(u, v, e)$ for $(u, v) \in \mathcal{R}_G$. The difficulty
is that even computing the total number of simple short paths (with
length no more than $k$) between $u$ and $v$, denoted as
$|P(u, v)|$, is known to be expensive. In the following, we describe
an efficient procedure to compute $\mathcal{P}(u, v)$ and
$\mathcal{P}(u, v, e)$. The basic idea is that we perform a DFS from
each vertex $u$ with traversal depth no more than $k$. During the
traversal from vertex $u$, we maintain the partial sums of both
$\mathcal{P}(u, v)$ and $\mathcal{P}(u, v, e)$ for each $v$ and $e$
that $u$ can reach within $k$ steps. After each traversal, we then
have the exact values of $\mathcal{P}(u, *)$ and
$\mathcal{P}(u, *, *)$.



Algorithm 1 ComputePUVE(G, u, k, p, w)

 1: Input: $G = (V, E)$ and starting vertex $u$;
 2: Input: spreading parameter $k$, path $p$, and partial product $w$;
 3: Output: $\mathcal{P}(u, *)$, $\mathcal{P}(u, *, *)$;
 4: if $|p| = k$ {traversal depth no more than $k$} then
 5:   return
 6: end if
 7: $z = p.top()$ {the last visited vertex in the traversal}
 8: for each $v \in Neighbor(z)$ and $v \notin p$ {simple path} do
 9:   $p.push(v)$ {the current path};
10:   $w \leftarrow w \times x_{(z,v)}$ {corresponding path product};
11:   $\mathcal{P}(u, v) \leftarrow \mathcal{P}(u, v) + w$;
12:   for each $e \in p$ {every edge in the current path} do
13:     $\mathcal{P}(u, v, e) \leftarrow \mathcal{P}(u, v, e) + w / x_e$;
14:   end for
15:   ComputePUVE(G, u, k, p, w);
16:   $p.pop()$; $w \leftarrow w / x_{(z,v)}$;
17: end for



The DFS procedure starting from $u$ to compute all
$\mathcal{P}(u, *)$ and $\mathcal{P}(u, *, *)$ is illustrated in
Algorithm 1. In Algorithm 1, we maintain the current path (based on
the DFS traversal procedure) in $p$, and its corresponding product
$\prod_{e \in p} x_e$ is maintained in variable $w$ (Lines 9 and 10).
Then, we incrementally update $\mathcal{P}(u, v)$, where $v$ is the
end of the path $p$ (Line 11). In addition, we go over each edge in
the current path and incrementally update $\mathcal{P}(u, v, e)$
($w / x_e = \prod_{e' \in p \setminus \{e\}} x_{e'}$, Line 13). Note
that we need to invoke this procedure for every vertex $u$ to compute
all $\mathcal{P}(u, v)$ and $\mathcal{P}(u, v, e)$.

Thus, the overall time complexity can be written as $O(|V| d^k)$ for
a random graph, where $d$ is the average vertex degree.
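The depth-limited DFS can be sketched in Python as follows. This is a hedged illustration, not a transcription of Algorithm 1: it recomputes each product directly instead of dividing the running product by $x_e$, which costs an extra factor of at most $k^2$ per path but avoids division by zero once an edge variable reaches 0. The four-cycle used as input is an assumed reconstruction of the example graph, consistent with the two short paths between $b$ and $d$.

```python
from collections import defaultdict


def compute_path_sums(adj, x, u, k):
    """DFS from u over simple paths with at most k edges.

    Returns (P_uv, P_uve), where P_uv[v] is the sum of path products from
    u to v and P_uve[(v, e)] is the same sum restricted to paths through
    e with x_e left out of each product.
    """
    P_uv = defaultdict(float)
    P_uve = defaultdict(float)

    def dfs(path_vertices, path_edges):
        if len(path_edges) == k:              # depth limit reached
            return
        z = path_vertices[-1]                 # last visited vertex
        for v in adj[z]:
            if v in path_vertices:            # keep the path simple
                continue
            path_vertices.append(v)
            path_edges.append(frozenset((z, v)))
            w = 1.0
            for f in path_edges:
                w *= x[f]
            P_uv[v] += w
            for f in path_edges:
                others = 1.0
                for h in path_edges:
                    if h != f:
                        others *= x[h]
                P_uve[(v, f)] += others
            dfs(path_vertices, path_edges)
            path_edges.pop()
            path_vertices.pop()

    dfs([u], [])
    return P_uv, P_uve


# Assumed example graph: cycle b - a - d - c - b.
e1, e2 = frozenset(("b", "a")), frozenset(("a", "d"))
e3, e4 = frozenset(("b", "c")), frozenset(("c", "d"))
adj = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["a", "c"]}
x = {e1: 0.9, e2: 0.5, e3: 0.7, e4: 0.2}      # illustrative values

P_uv, P_uve = compute_path_sums(adj, x, "b", k=2)
```

Calling the function once per start vertex yields every $\mathcal{P}(u, v)$ and $\mathcal{P}(u, v, e)$; each unordered pair is then visited from both endpoints, so an implementation may keep only one orientation.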
Overall Gradient Algorithm 

The overall gradient optimization algorithm is depicted in Algorithm
2. Here, we use $C$ to denote all the edges that need to be processed
for optimization. At this point, we consider all the edges and thus
$C = E$. Later, we will consider first selecting some candidate edges.
The algorithm proceeds iteratively, and each iteration has three major
steps:

Step 1 (Lines 6-8): it calculates the gradient $g(x_e)$ for every
edge variable $x_e$ and an average gradient $\bar{g}$;



Algorithm 2 OptimizationAlg(G, L)

 1: Input: $G = (V, E)$, and edge removal budget $L$;
 2: Output: edge set $E_r$;
 3: $\forall e \in C$ ($C = E$), $x_e \leftarrow 1$; {initialization}
 4: $A \leftarrow \emptyset$; {active set}
 5: while NOT every $x_e$ converges do
 6:   $\forall x_e$, calculate $\mathcal{P}(u, v)$ and $\mathcal{P}(u, v, e)$ using Algorithm 1;
 7:   $\forall x_e$, $g(x_e) \leftarrow -\lambda \sum_{(u,v) \in \mathcal{R}_G} e^{-\lambda \mathcal{P}(u,v)} \mathcal{P}(u, v, e)$;
 8:   $\bar{g} \leftarrow \sum_{e \in C \setminus A} g(x_e) / |C \setminus A|$; {average gradient}
 9:   for each $e \in C \setminus A$ do
10:     if bound reached ($\sum_{e \in C} x_e < |C| - L$) {using $x_e$ from last iteration} then
11:       $x_e \leftarrow \max(x_e + \beta(g(x_e) - \bar{g}), 0)$;
12:     else
13:       $x_e \leftarrow \max(x_e + \beta g(x_e), 0)$;
14:     end if
15:     $x_e \leftarrow \min(x_e, 1)$;
16:   end for
17:   for each $e \in C \setminus A$ and ($x_e = 1$ or $x_e = 0$) do
18:     $A \leftarrow A \cup \{e\}$; {add to active set}
19:   end for
20:   for each $e \in A$ and (($x_e = 0 \wedge g(x_e) \leq \bar{g}$) $\vee$ ($x_e = 1 \wedge g(x_e) \geq \bar{g}$)) do
21:     $A \leftarrow A \setminus \{e\}$; {remove from active set}
22:   end for
23: end while
24: sort all $\{x_e\}$ in increasing order, and add top $L$ edges to $E_r$;



Step 2 (Lines 9-16): only those variables that are not in the active
set A are updated. Specifically, if the condition
($\sum_{e \in E} x_e \ge |E| - L$) is not met, we try to adjust $x_e$
back to the feasible region. Note that by using $g(x_e) - \bar{g}$
(Line 11) instead of $g(x_e)$ (Line 13), we are able to increase the
value of those $x_e$ whose gradient magnitude is below average (all
gradients are negative). However, such an adjustment still guarantees
that the overall objective function does not decrease (and thus the
algorithm converges). Also, we make sure $x_e$ stays between 0 and 1.
Step 3 (Lines 17-22): the active set is updated. When an edge
variable reaches 0 or 1, we put it in the active set so that we do not
need to update it in Step 2. However, for those edge variables in the
active set, if their gradients are less (higher) than the average
gradient for $x_e = 0$ ($x_e = 1$), we release them from the active set
and let them be further updated.

Note that gradient ascent with the active set method guarantees the
convergence of the algorithm (mainly because the overall objective
function does not decrease). However, we note that in Algorithm 2 the
bound condition ($\sum_{e \in E} x_e \ge |E| - L$) may not necessarily
be satisfied even with the update in Line 11. Though this can be
achieved through additional adjustment, we do not consider it, mainly
because the goal here is not to find the exact optimum but to identify
the L edges with the smallest $x_e$. Finally, the overall time
complexity of the optimization algorithm is $O(t(|V| d^k + |E|))$,
where t is the maximum number of iterations before convergence.
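
As a rough illustration of Step 2 and of the final selection in Line 24, the following Python sketch performs one update pass over the edge variables and then picks the L smallest ones. It assumes the gradients have already been computed; the names `gradient_update` and `select_edges`, the dict-based storage and the step size argument `beta` are assumptions of this sketch, and the active set maintenance of Step 3 is deliberately left out.

```python
def gradient_update(x, g, active, beta, L):
    """One pass of Step 2 (hypothetical helper).

    x, g:   dicts mapping an edge key to x_e and to its gradient g(x_e).
    active: set of edge keys that are currently frozen.
    Returns a new dict with the updated edge variables.
    """
    free = [e for e in x if e not in active]
    if not free:
        return dict(x)
    g_bar = sum(g[e] for e in free) / len(free)      # average gradient
    # The bound test uses the values of x from the previous iteration.
    bound_reached = sum(x.values()) < len(x) - L
    new_x = dict(x)
    for e in free:
        step = g[e] - g_bar if bound_reached else g[e]
        new_x[e] = min(max(x[e] + beta * step, 0.0), 1.0)   # clamp to [0, 1]
    return new_x


def select_edges(x, L):
    """Final step: the L edges with the smallest x_e form E_r."""
    return sorted(x, key=x.get)[:L]
```

When the bound is reached, subtracting the average gradient keeps the sum of the free variables unchanged (before clamping), which is why the update moves mass between edges rather than shrinking all of them.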

4. SHORT BETWEENNESS AND SPEEDUP 
TECHNIQUES 

In Section 3, we reformulated our problem as a numerical optimization
problem and developed an iterative gradient algorithm to select the top
L edges into $E_r$. However, the basic algorithm cannot scale well to
very large graphs due to the large number ($|E|$) of variables
involved. In this section, we introduce a new variant of the edge
betweenness and use it to quickly reduce the number of variables needed
in the optimization algorithm (Algorithm 2). In addition, we can
further speed up the DFS procedure that computes $\mathcal{P}(u, v)$
and $\mathcal{P}(u, v, e)$ in Algorithm 1.

4.1 Short Betweenness 

In this subsection, we consider the following question: what edge
importance measure directly correlates with $x_e$ in the objective
function in Eq. 4, so that we can use it to quickly identify a
candidate edge set for the numerical optimization described in
Algorithm 2? In this work, we propose a new edge-betweenness measure,
referred to as the short betweenness, to address this question. It is
intuitively simple and has an interesting relationship with the
gradient $g(x_e)$ of each edge variable. It can even be directly
applied for selecting $E_r$ using the generic procedure in Section 2,
and it is much more effective than the global and local edge
betweenness, which measure edge importance in terms of shortest paths
(see the comparison in Section 5).

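Here we formally define the short betweenness $SB(e)$.

Definition 4. (Short Betweenness) The short betweenness $SB(e)$ for
edge e is defined as

$$SB(e) = \sum_{(u,v) \in \mathcal{R}_G} \frac{|P(u,v,e)|}{|P(u,v)|}.$$

Recall that $(u, v) \in \mathcal{R}_G$ means $d(u, v) \le k$;
$|P(u, v)|$ is the number of short paths between u and v; and
$|P(u, v, e)|$ is the number of short paths between u and v which must
go through edge e. The following lemma highlights the relationship
between the short betweenness and the gradient of the edge variable
$x_e$:

LEMMA 2. Assuming $x_e = 1$ for all edge variables, $g(x_e) \ge -SB(e)$.

Proof Sketch:

$$\begin{aligned}
g(e) &= \sum_{(u,v) \in \mathcal{R}_G} -\lambda\, \mathcal{P}(u,v,e)\, e^{-\lambda \mathcal{P}(u,v)} \\
     &= \sum_{(u,v) \in \mathcal{R}_G} -\lambda\, |P(u,v,e)|\, e^{-\lambda |P(u,v)|} \quad (\forall e,\ x_e = 1) \\
     &\ge \sum_{(u,v) \in \mathcal{R}_G} -\frac{|P(u,v,e)|}{|P(u,v)|} \\
     &= -SB(e)
\end{aligned}$$

The inequality holds because $\lambda y e^{-\lambda y} \le 1$ for
$y = |P(u,v)| > 0$. □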

Basically, when $x_e = 1$ for every edge variable $x_e$ (this is also
the initialization of Algorithm 2), the (negative) short betweenness
serves as a lower bound of the gradient $g(e)$. In particular, since
the gradient is negative, the larger $|g(e)|$ is, the more likely
removing e can maximize the objective function (cut more reachable
pairs in $\mathcal{R}_G$). Here, the short betweenness $SB(e)$ thus
provides an upper bound (or approximation) on $|g(e)|$ (assuming all
other edges are present in the graph), and it measures the potential of
the edge in removing those local reachable pairs. Finally, we note that
Algorithm 1 can be utilized to compute $|P(u, v)|$ and $|P(u, v, e)|$,
and thus the short betweenness (simply by setting $x_e = 1$ for all
edge variables).
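
To make the definition concrete, the sketch below computes the short betweenness of every edge by enumerating all simple paths with at most k edges. It is a hedged illustration only: the function name `short_betweenness` and the adjacency-dict input are assumptions made here, and the sum is taken over unordered vertex pairs, which the text does not state explicitly.

```python
from collections import defaultdict


def short_betweenness(adj, k):
    """Return {edge: SB(edge)} for an undirected graph (hypothetical helper).

    adj: dict mapping each vertex to an iterable of neighbours.
    Edges are keyed by frozenset({a, b}).
    """
    order = {v: i for i, v in enumerate(adj)}
    total = defaultdict(int)     # (u, v)    -> |P(u, v)|
    through = defaultdict(int)   # (u, v, e) -> |P(u, v, e)|

    def dfs(u, z, path, edges):
        if len(edges) == k:                  # only paths with at most k edges
            return
        for v in adj[z]:
            if v in path:                    # simple paths only
                continue
            path.append(v)
            edges.append(frozenset((z, v)))
            if order[u] < order[v]:          # count each unordered pair once
                total[(u, v)] += 1
                for e in edges:
                    through[(u, v, e)] += 1
            dfs(u, v, path, edges)
            path.pop()
            edges.pop()

    for u in adj:
        dfs(u, u, [u], [])

    sb = defaultdict(float)
    for (u, v, e), count in through.items():
        sb[e] += count / total[(u, v)]
    return dict(sb)
```

On the path graph a-b-c-d with k = 2, for example, the middle edge lies on the short paths of three pairs while each outer edge lies on those of two, so the middle edge receives the highest score.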

Scaling Optimization using Short Betweenness: First, we can directly
utilize the short betweenness to pick a candidate set of edge
variables, so that Algorithm 2 only needs to work on these edge
variables (the other edge variables are fixed at 1). Basically, we
choose a subset of edges $E_s$ with the highest short betweenness in
the entire graph. The size of $E_s$ has to be larger than L; in
general, we can assume $|E_s| = \alpha L$, where $\alpha > 1$. In the
experimental evaluation (Section 5), we found that when $\alpha = 5$,
the performance of using the candidate set is almost as good as that of
the original algorithm, which uses all edge variables. Once the
candidate edge set is selected, we make the following simple
observation:

LEMMA 3. Given a candidate edge set $E_s \subseteq E$, if any reachable
pair $(u, v) \in \mathcal{R}_G$ can be cut by $E_r$, where
$E_r \subseteq E_s$ and $|E_r| = L$, then each path in $P(u, v)$ must
contain at least one edge in $E_s$.

Clearly, if there is one path in $P(u, v)$ that does not contain an
edge in $E_s$, the pair will always remain linked no matter how we
select $E_r$, and thus it cannot be cut by any $E_r \subseteq E_s$. In
other words, (u, v) has to be cut by $E_s$ if it can be cut by $E_r$.
Given this, we introduce
$\mathcal{R}_s = \mathcal{R}_G \setminus \mathcal{R}_{G \setminus E_s}$.
Note that $\mathcal{R}_s$ can be easily computed by a DFS traversal
procedure similar to Algorithm 1. Thus, we can focus on optimizing

$$\sum_{(u,v) \in \mathcal{R}_s} e^{-\lambda \mathcal{P}(u,v)}, \quad \text{where } 0 \le x_e \le 1,\ \sum x_e \ge X - L \qquad (6)$$

Furthermore, let
$E_P = \bigcup_{(u,v) \in \mathcal{R}_s} \bigcup_{p \in P(u,v)} p$,
which records those edges appearing in some path linking a reachable
pair cut by $E_s$. Clearly, the edges in $E \setminus E_P$ can simply
be pruned from the original graph G without affecting the final
results. To sum up, the short betweenness measure can help speed up the
numerical optimization process by reducing the number of edge variables
and pruning non-essential edges from the original graph.
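
The reduction to $\mathcal{R}_s$ and the pruning set can be sketched with the same kind of DFS. The code below is an assumption-laden illustration rather than the authors' procedure: the function name `candidate_pairs_and_edges`, the adjacency-dict input and the use of unordered pairs are choices made for this sketch. A pair is kept only if every one of its short paths contains a candidate edge, as required by Lemma 3.

```python
def candidate_pairs_and_edges(adj, k, E_s):
    """Return (R_s, E_P) for a candidate edge set E_s (hypothetical helper).

    adj: dict mapping each vertex to an iterable of neighbours.
    E_s: set of candidate edges, each a frozenset({a, b}).
    R_s: pairs (u, v) whose every short path uses at least one edge of E_s.
    E_P: edges lying on some short path of a pair in R_s.
    """
    order = {v: i for i, v in enumerate(adj)}
    all_hit = {}    # (u, v) -> True while every short path so far meets E_s
    on_paths = {}   # (u, v) -> edges seen on the short paths between u and v

    def dfs(u, z, path, edges, hits):
        if len(edges) == k:
            return
        for v in adj[z]:
            if v in path:                    # simple paths only
                continue
            e = frozenset((z, v))
            path.append(v)
            edges.append(e)
            h = hits + (1 if e in E_s else 0)
            if order[u] < order[v]:          # handle each unordered pair once
                key = (u, v)
                all_hit[key] = all_hit.get(key, True) and h > 0
                on_paths.setdefault(key, set()).update(edges)
            dfs(u, v, path, edges, h)
            path.pop()
            edges.pop()

    for u in adj:
        dfs(u, u, [u], [], 0)

    R_s = {pair for pair, ok in all_hit.items() if ok}
    E_P = set()
    for pair in R_s:
        E_P |= on_paths[pair]
    return R_s, E_P
```

Edges outside the returned `E_P` do not lie on any short path of a pair that the candidate set can cut, so they can be dropped before running the optimization.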

5. EXPERIMENTAL STUDY 

In this section, we report the results of the empirical study of our
methods. Specifically, we are interested in the performance (in terms
of reachable pairs cut) and the efficiency (running time).
Performance: Given a set of edges $E_r$ with budget L, the total number
of reachable pairs cut by $E_r$ is
$|\mathcal{R}_G| - |\mathcal{R}_{G \setminus E_r}|$, or simply
$\Delta|\mathcal{R}_G|$. We use the average number of pairs cut per
edge, i.e., $\delta = \Delta|\mathcal{R}_G| / L$, as the performance
measure.
Efficiency: The running time of different algorithms.
Methods: Here we compare the following methods:

1) Betweenness based method, which is defined in terms of the shortest
paths between any two vertices in the whole graph G; hereafter, we use
BT to denote the method based on this criterion.

2) Local Betweenness based method, which, compared with the betweenness
method (BT), takes only the vertex pairs within a certain distance into
consideration; hereafter, we use LB to denote the method based on local
betweenness.

3) Short Betweenness based method, the new betweenness introduced in
this paper, which considers all short paths whose length is no more
than a certain threshold. We denote the method based on short
betweenness as SB.

4) Numerical Optimization method, which solves the de-small-world
problem iteratively by calculating gradients and updating the edge
variables $x_e$. Based on whether the method uses the candidate set or
not, we have two versions of the optimization method: OMW (Optimization
Method With candidate set) and OMO (Optimization Method withOut
candidate set). Note that we normally choose the top 5L edges as our
candidate set.
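
The performance measure $\delta$ defined at the start of this section can be evaluated for any of the methods above with a depth-limited BFS. The following is a minimal sketch under stated assumptions: the names `count_reachable_pairs` and `delta`, the adjacency-dict input and the counting of unordered pairs are choices made here, not details given in the text.

```python
from collections import deque


def count_reachable_pairs(adj, k, removed=frozenset()):
    """Number of unordered pairs within distance k once `removed` edges are gone."""
    count = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            z = queue.popleft()
            if dist[z] == k:                 # do not expand beyond distance k
                continue
            for v in adj[z]:
                if v not in dist and frozenset((z, v)) not in removed:
                    dist[v] = dist[z] + 1
                    queue.append(v)
        count += len(dist) - 1               # vertices reached from s, without s
    return count // 2                        # each pair was seen from both ends


def delta(adj, k, E_r):
    """Average number of reachable pairs cut per removed edge."""
    E_r = set(E_r)
    cut = count_reachable_pairs(adj, k) - count_reachable_pairs(adj, k, E_r)
    return cut / len(E_r)
```

For the path graph a-b-c-d with k = 2, removing the middle edge cuts three of the five reachable pairs, so `delta` returns 3.0 for that single-edge removal set.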

As mentioned before in Section 2, we have a generic procedure to select
L edges depending on the parameter r (batch size). We found that for
the methods BT, LB and SB, the effect of r is rather small (as
illustrated in Figure 7). Thus, in the remainder of the experiments, we
choose r = L, i.e., we select the top L edges using the betweenness
calculated for the entire (original) graph.
All the algorithms are implemented in C++ with the Standard Template
Library (STL), and the experiments are conducted on a 2.0GHz Dual Core
AMD Opteron CPU with 4.0GB RAM running Linux.

5.1 Result on Synthetic Datasets 

In this subsection, we study the performance and efficiency of
different methods on synthetic datasets. Here, we generate various
synthetic networks from two well-known small-world models: the Watts
and Strogatz model (WS model) [19], which generates small-world graphs
by interpolating between an ER graph and a regular ring lattice, and
the small-world model proposed by Kleinberg [10] (KS model). The
networks generated from the WS model and the KS model are referred to
as WS networks and KS networks, respectively.

In the following, we conduct three groups of experiments.
Varying |V|: In this group of experiments, we generate networks with
each of the two models (WS, KS) using vertex sizes 1k, 5k and 10k. We
also set L = 1000 (edge removal budget) and k = 3 (spreading
parameter). The results are summarized in Figures 10(a) and 10(e). From
these two figures, we can see that the LB method always produces the
worst result (its $\delta$ is around 100, meaning each edge on average
contributes to around 100 reachable pairs). Meanwhile, $\delta$ for the
BT method increases dramatically from 150 to 300 for both KS and WS
graphs. Comparatively, OMW always removes the largest number of pairs
among all methods. More specifically, its $\delta$ grows from 175 to
more than 400. The SB method produces results similar to those of OMW.
This suggests the power of short betweenness (which directly forms an
upper bound for the absolute gradient $|g(x_e)|$).

Varying L: In this group of experiments, we study the reduction effect
for different L; the result for the KS model is reported in Figure
10(b). Generally, with the increase of L, $\delta$ decreases. This is
reasonable because as more reachable pairs are removed, each additional
edge can remove fewer reachable pairs. For the specific algorithms,
similar to the previous group of experiments, the LB and OMW methods
produce the lowest and highest $\delta$, respectively. The number of
reachable pairs removed by the BT method is about three times that of
LB, and is only a fraction of that removed by SB and OMW. The same
holds for the graphs generated by the WS model, as shown in Figure
10(f).

Varying k: Recall that we define short paths as the paths with length
at most k. Given G, k obviously determines the number of reachable
pairs. The results of all algorithms for different k are reported in
Figure 10(c) for KS model graphs. We can see that, generally, with the
increase of k, the strength of each edge ($\delta$) increases. This is
understandable because as k increases, each edge can affect more
reachable pairs. For the specific algorithms, LB produces the lowest
$\delta$ for all k. The other three methods produce similar $\delta$,
which is normally about four times that of LB. A similar situation
occurs for WS graphs, as shown in Figure 10.



Figure 4: Network Statistics 
5.2 Result on Real Datasets

In this subsection, we study the performance of our algorithms on real
datasets. The benchmark datasets are listed in Figure 4. All networks
exhibit properties commonly observed in social networks, such as a
small diameter. All datasets are downloadable from the Stanford Large
Network Dataset Collection.

In Figure 4, we present important characteristics of all real datasets,
where n is the graph diameter. All these nine networks are snapshots of
the Gnutella peer-to-peer file sharing network starting from August
2002. Nodes stand for the hosts in the Gnutella network topology, and
the edges stand for the connections between the hosts.
Varying L: We perform this group of experiments on dataset Gnu05 and
fix k = 3. We run the methods with three different edge budgets L: 500,
1000 and 2000. The result is reported in Table 8. The general trend is
that with smaller L, $\delta$ becomes bigger. This is because the sets
of reachable pairs removed by different edges can intersect; when one
edge is removed, the set of reachable pairs left for the other edges is
also reduced. For particular methods, BT and OMO produce the lowest and
highest $\delta$, respectively, and the difference between OMW and OMO
is very small.
Varying k: In this group of experiments, we fix L = 1000 and we
choose GnuOi. Here we choose three values for k: 2, 3 and 4. The result
is reported in Table 9. From the result, we can see that when k becomes
bigger, $\delta$ becomes higher. This is also reasonable: when k
becomes bigger, more reachable pairs are generated while $|E|$ stays
constant; therefore, each edge is potentially able to remove more
reachable pairs. From the above three groups of experiments, we can see
that OMO does not produce significantly better results than OMW.
Therefore, in the following experiment, we do not study the OMO method
again.

$\delta$ on all real datasets: In this group of experiments, we study
the performance of each method on these nine datasets, with L being
proportional to $|E|$. Specifically, $L = |E| \times 1\%$. We report
the result in Figure 5. LB generally produces the lowest $\delta$,
around half that of BT, and the best methods are SB and OMW.
Specifically, OMW is always slightly better than SB.

6. CONCLUSION 

In this paper, we introduce the de-small-world network problem. To
solve it, we first present a greedy edge betweenness based approach as
the benchmark and then provide a numerical relaxation approach that
solves our problem using the OR-AND Boolean format and can find a local
optimum. In addition, we introduce the short betweenness to speed up
our algorithm. The empirical study demonstrates the efficiency and
effectiveness of our approaches. In the future, we plan to utilize the
MapReduce framework (e.g., Hadoop) to scale our methods to handle
graphs with tens of millions of vertices.

Figure 5: $\delta$ for all real datasets
Figure 9: $\delta$ by varying k
APPENDIX

A. PROOF OF THEOREM 1 

To prove this theorem, we first introduce the dense K-sub-hypergraph
problem. Let $H = (V_H, E_H)$ be a hypergraph, where $V_H$ is the
vertex set and $E_H$ is the hyperedge set, such that each
$e_H \in E_H$ satisfies $e_H \subseteq V_H$. Each hyperedge in the
hypergraph is a subset of vertices (not necessarily a pair as in a
general graph). Furthermore, given a subset of vertices
$V_s \subseteq V_H$, if a hyperedge $e_H \subseteq V_s$, then we say
this hyperedge is covered by the vertex subset $V_s$.

Definition 5. Dense K-Sub-Hypergraph Problem: Given a hypergraph
$H = (V_H, E_H)$ and a parameter K, we seek a subset of vertices
$V_s \subseteq V_H$ with $|V_s| = K$ such that the maximal number of
hyperedges in $E_H$ is covered by $V_s$.

The dense K-sub-hypergraph problem can easily be proven to be NP-hard,
because its special case, the dense K-subgraph problem (each edge
$e \in V \times V$), has been shown to be NP-hard [4].

Proof Sketch: To reduce the dense K-sub-hypergraph problem to our
problem, we construct the following graph from the given hypergraph
$H = (V_H, E_H)$. Figure 11 illustrates the transformation. For each
hyperedge $e_i \in E_H$, which consists of a set
$P_i = \{v_{i_1}, v_{i_2}, \cdots, v_{i_k}\}$ of vertices in $V_H$, we
represent it as a vertex pair $p_i^s$ and $p_i^t$ in the graph G, and
each pair is connected by $|P_i|$ different paths of length 3, where
each middle edge in G corresponds to a unique vertex in $V_H$. For
instance, for hyperedge $p_i = \{a, b, c\}$, the middle edges of the
three paths in G correspond to a, b and c. To facilitate our
discussion, the middle edges of these length-3 paths linking the vertex
pairs corresponding to each hyperedge in H are referred to as MS, which
has a one-to-one correspondence to the vertex set $V_H$ in the
hypergraph.

Figure 11: Graph for Reachable Pair Cut Problem

Now we show that for the dense K-sub-hypergraph problem, its optimal
solution can be obtained by solving an instance of the general
reachable pair cut problem, where L = K and RS consists of all the
$(p_i^s, p_i^t)$ reachable pairs (k = 3). In other words,
$|RS| = |E_H|$. Specifically, we need to show that if a subset of edges
S ($|S| = L$) in MS can maximally disconnect the reachable pairs in RS,
then its corresponding vertex subset $V_s$ can maximally cover the
hyperedges. This is easy to observe due to the one-to-one
correspondence between MS and $V_H$ and between RS and $E_H$. Given
this, we need to show that the optimal solution of the reachable pair
cut problem can always be found using only edges in MS. Suppose the
edge set ES' with size L is an optimal solution which contains some
edge $e \in ES'$ with $e \notin MS$. In this case, we can simply
replace e with its adjacent middle edge e' from MS in the result set.
This is because the replacement $(ES' \setminus \{e\}) \cup \{e'\}$
will still disconnect all the reachable pairs in RS cut by ES'. Note
that the middle edge e' cuts a superset of the paths cut by e with
respect to RS. □
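
The construction used in this proof sketch can be made concrete with a small Python sketch. Everything named here is an assumption made for illustration: the helper names `build_reduction_graph` and `within_k`, the tuple-based vertex labels, and the choice to attach each pair vertex to the two endpoints of a shared middle edge. The sketch builds the graph from a list of hyperedges and checks, for a chosen set of removed middle edges, which pairs are still connected within distance 3.

```python
from collections import deque


def build_reduction_graph(hyperedges):
    """Build the graph G of the reduction (hypothetical helper).

    hyperedges: list of sets of hypergraph vertices.
    Returns (adj, middle, pairs): the adjacency dict of G, the middle edge of
    each hypergraph vertex, and one (source, target) pair per hyperedge.
    """
    adj = {}

    def add_edge(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    middle = {}
    pairs = []
    for i, hyperedge in enumerate(hyperedges):
        s, t = ("s", i), ("t", i)
        pairs.append((s, t))
        for v in hyperedge:
            left, right = ("l", v), ("r", v)
            middle[v] = frozenset((left, right))   # one middle edge per vertex
            add_edge(left, right)
            add_edge(s, left)                      # path s - left - right - t
            add_edge(right, t)
    return adj, middle, pairs


def within_k(adj, s, t, k, removed):
    """True if t is within distance k of s after deleting the `removed` edges."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        z = queue.popleft()
        if z == t:
            return True
        if dist[z] == k:
            continue
        for v in adj[z]:
            if v not in dist and frozenset((z, v)) not in removed:
                dist[v] = dist[z] + 1
                queue.append(v)
    return False
```

With this construction, removing the middle edges of a vertex subset disconnects exactly those pairs whose hyperedge is covered by the subset, which is the correspondence the proof relies on.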



